(54) Description framework for audiovisual content

(57) A system and method for providing a description framework for an audiovisual presentation system. The system includes an interface that allows the user to consume several different representations of audiovisual content. The system also includes a descriptive structure that identifies and locates the summary selected by the user. The user is presented with a multi-view menu of the available types of summaries and selects a summary type, and the system provides summaries of that type to the user. The summary description service is provided to the user (60) based upon user preferences and history. When audiovisual material is then transmitted to the user (60), the description service provides the user (60) with the summary description that allows the user (60) to make and consume summaries of the material.

Description

BACKGROUND OF THE INVENTION 

[0001] This application is a continuation of US Provisional Application No. 60/154,389, filed 9/16/99 and claims priority thereto.

1. Field of the Invention

[0002] This invention relates to descriptions of audio-visual material. More particularly, this invention relates to a method for specifying descriptions that allow users to navigate amongst different audiovisual material, and browse and experience the content of a particular audiovisual program, quickly and effectively.

2. Background of the Invention

[0003] Digital audiovisual material is becoming increasingly available to users through digital TV broadcast, digital video cameras, DVD, and PC-based access to multimedia on the Internet. In addition, persistent large-volume storage and storage that allows non-linear access to audiovisual content, such as hard disk storage in powerful PC platforms and personal video recorders (PVR), is becoming available in consumer devices. Consequently, there is a need for rapid navigation and browsing capabilities to enable users to efficiently discover and consume the contents of audiovisual material or programs.

[0004] Users would also benefit from having non-linear access to different views of a particular program, a feature not currently available. The views could be adaptive to the user's personal preferences, interests or usage conditions, such as the amount of time the user wants to spend in consuming the content, or the resources available to the user's terminal. Such adaptability would enhance the entertainment and educational value of audiovisual information.

[0005] This proliferation of audio-visual material available to users has the potential to overwhelm the viewer and lead to frustration at the inability to browse and view content in an efficient manner. Viewing summaries of the content allows the viewer/user to skip irrelevant content and locate the desired content quickly and easily. Further, multiple different summaries, if available, could provide the user with alternative views of a particular program that the user could choose from depending on personal preferences or usage conditions.

[0006] This capability is appearing more frequently in newer technologies, such as the digital video disk (DVD). DVD movies provide 'scene selections' or 'chapter selections' that have a visual array of thumbnails and textual titles associated with each scene. This allows the user to click on the thumbnail of the desired scene, jump to that scene and begin playback. Playback typically continues until the end of the movie, unless the viewer makes another selection.

[0007] However, this technology remains limited, providing only the capability to index for the purpose of jumping to an arbitrary position and continuing playback from that position. Additionally, these selections are currently available only for movies and cannot be provided for other types of audio-visual content, such as home movies or recordings of real-time television broadcasts. This capability can be seen as a visual index, a simple form of a summary description.

[0008] A system in which such summaries and descriptions can be used is discussed in co-pending US Patent Application No. 09/299,811, filed 4/26/99, owned by the assignee of this application and incorporated by reference herein. The system discussed functions in a typical audiovisual system including several devices such as a television, cable or satellite reception, a sound system, etc. The term system refers to both individual devices and systems of several of these devices.

[0009] However, the reference does not provide certain aspects of implementation of such a system, including models for usage and provision of content and services.

SUMMARY OF THE INVENTION 

[0010] One aspect of the invention is a system that provides a descriptive framework about programs presented by an audiovisual system. The framework includes an interface allowing a user to view representations of audiovisual material and a descriptive structure that identifies and locates each of the representations of audiovisual material and data associated with the representation. Examples of such representations could be a multimedia title description and summary descriptions.

[0011] Another aspect of the invention is a method for providing alternative summaries to the user having the steps of presenting a multi-view menu of the available types of summaries to the user. The summaries can be hierarchical or non-hierarchical. A user selection of a summary type is received and the selected summaries are provided.

[0012] Yet another aspect of the invention is a method of providing summary description services of audiovisual content to a user. Information is received from the user including specifications of platform resources at the user end and user preferences. Usage history of the user is tracked and used in conjunction with the specifications and preferences to transmit audiovisual material to the user with associated summary descriptions. The summary descriptions can be provided in such a way that the user can send the summary descriptions to other users.

BRIEF DESCRIPTION OF THE DRAWINGS

[0013] For a more complete understanding of the present invention and for further advantages thereof, reference is now made to the following Detailed Description taken in conjunction with the accompanying Drawings in which:

Figure 1 shows a block diagram representation of an audiovisual presentation system in accordance with the invention.

Figure 2 shows a method for selection of an audiovisual program within a description framework in accordance with the invention.

Figure 3 shows a block diagram representation of a summary description scheme in accordance with the invention.

Figure 4 shows a block diagram representation of alternative summaries available within a summary description scheme in accordance with the invention.

Figure 5 shows a block diagram representation of alternative summaries available within a hierarchical summary description scheme in accordance with the invention.

Figure 6 shows a flow chart for one embodiment of provision of audiovisual services in accordance with the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

[0014] As mentioned previously, an overall system for creating and managing description schemes for an audiovisual presentation system is disclosed in co-pending US Patent Application No. 09/299,811, filed 4/26/99.

[0015] In this system, the video, image and/or audio information, which will be referred to as presentation information, is made available to a user and/or a system. The information is presented to the user from a system such as a television or radio. The user or the user's agent interacts with the system to receive the information in a desirable manner and to define preferences as to what type of information is obtained. The term user will refer to the end recipient of the information, which could be a person, a machine or a software program running on a machine, as examples.

[0016] To define these interactions, a set of description schemes containing data describing each component is defined, with reference to an overall audiovisual presentation system 10 as shown in Figure 1. The user preferences 12 are used in several different areas to maximize both the user's enjoyment and the system utility to the user. The preferences describing the topics and subject matter of interest to the user are used in both searching for and filtering the audiovisual programs 14. These two sets of data, the user preferences and program descriptions 14, are correlated in the filtering and search engine 16 to identify the preferred programs.

[0017] The programs identified by the filtering and search engine 16 are then sent to a browsing module 18, along with the user's browsing preferences. Another output from the filtering and search engine 16 is the set of preferred programs that the user has designated for storage. These are stored in storage module 20. The programs selected by the user with the browsing module are then sent to the display 22. The user may utilize multimedia title descriptions of preferred programs to navigate among the programs that the user wants to consume. Once a program is selected, the summary description of that particular program is correlated with the user's browsing preferences to offer the user preferred alternative summaries.
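
The correlation performed by the filtering and search engine 16 could be sketched as follows. This is only a minimal illustration under stated assumptions: the class names `ProgramDescription` and `UserPreferences`, the `score` function and the simple hit-count ranking are hypothetical and are not specified by this description.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ProgramDescription:
    """Hypothetical program description: a title, a genre and topic keywords."""
    title: str
    genre: str
    keywords: Set[str] = field(default_factory=set)


@dataclass
class UserPreferences:
    """Hypothetical user preferences: preferred genres and topics of interest."""
    genres: Set[str] = field(default_factory=set)
    topics: Set[str] = field(default_factory=set)


def score(program: ProgramDescription, prefs: UserPreferences) -> int:
    """Count how many stated preferences the program description matches."""
    genre_hit = 1 if program.genre in prefs.genres else 0
    topic_hits = len(program.keywords & prefs.topics)
    return genre_hit + topic_hits


def filter_programs(programs: List[ProgramDescription],
                    prefs: UserPreferences) -> List[ProgramDescription]:
    """Keep programs matching at least one preference, best match first."""
    matches = [p for p in programs if score(p, prefs) > 0]
    # sorted() is stable, so equally scored programs keep their input order.
    return sorted(matches, key=lambda p: score(p, prefs), reverse=True)
```

A real engine would weigh preferences against usage history as well; the hit count here only stands in for whatever correlation measure an implementation chooses.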

[0018] The display 22 receives the programs and displays them in accordance with the user's device preferences as to the operation of the display. The user's device preferences may include, for example, device settings such as a volume setting that may vary with the genre of the program that is being watched. The display and the user's interaction with the display, such as stopping a program before its end and watching certain types of programs with certain device settings, also provide information in a manner analogous to a feedback loop to update and log the usage history 24. The usage history 24 can be mapped against the preferences by mapping module 26. This information is then used in conjunction with user inputs by the user preference module 12.

[0019] These documented user preferences can be useful in several contexts, not just an audiovisual presentation system. The user preferences and usage history conform to a specified format similar to that of the description of the audiovisual program information and can therefore be accumulated in the system as usage history information for further use in selecting the contents desired by the user. Furthermore, the usage history information can be transmitted to the provider of the audiovisual programs 14 to receive a selected audiovisual program or directly receive audiovisual program summaries selected by the user. In the latter case, user preferences are correlated with summary descriptions at the provider side to select and directly deliver summarized audiovisual programs to the user. The preferences and summary descriptions and so on could also be transferred to a 'smart card' 28 or similar portable storage means and ultimately transferred to another system by the user. However, the details of this type of transfer are beyond the scope of the current invention and are only mentioned for completeness.

[0020] In this context, there are several description schemes, which were the subject matter of the previously mentioned co-pending patent application. However, that discussion was at the overall system level, and no framework for the individual description schemes at the user, program or device level was considered. This invention is intended to provide, as a technique of realizing efficient navigation and browsing of audiovisual programs using their summaries, a title description scheme capable of including multimedia information and a summary description scheme for describing hierarchical summaries of an audiovisual program and, furthermore, to construct a system and a service model utilizing description data based on the above description schemes.

[0021] As shown in Figure 1, the audiovisual programs 14 include descriptions of the programs in a description framework. The description framework can have several different types of descriptive structures. Of particular interest here are the multimedia title description and the summary description. The framework can contain either one of these types, both of these, and either one or both in combination with other types of program descriptions including metadata. For example, metadata on the creation of the program (director, actors, language, etc.) and the genre of the program can be provided.

[0022] In operation, the user manipulates the descriptive structures to select audiovisual programs presented by the presentation system of Figure 1. This view and select process can occur in several ways, as shown in Figure 2. For ease of discussion and understanding, one could view the description framework like an electronic library. The user could browse and search the programs by their titles, analogous to the multimedia title description, or by a more robust summary, analogous to the summary descriptions. The descriptive structures such as the multimedia title description and the summary descriptions can be in one of several forms, including text, audio clips, video clips, still images, etc.

[0023] Summary descriptions enable rapid navigation and browsing in this system. In particular, summary descriptions enable key-frame summaries, event-based video summaries that group video segments containing certain events, and video highlight summaries of particular duration. These summary description schemes contain references to the audiovisual media and its segments, frames, and audio tracks that can be efficiently utilized by a presentation engine in rendering different summaries and views of a program. Hence, when an audiovisual program has multiple versions of its summary description, the system can subsequently generate respective views by means of the presentation engine using each of the summary description versions. Consequently, the system provides an efficient means for using multiple views of a program without the need for pre-storing its multiple versions in a separate storage area, thus saving data storage space at the system side.

[0024] The term summary description as used here refers to summaries that conform to a set of rules for such summaries. The syntax, semantics, and rules for these summary descriptions are contained in summary description schemes. Summary description schemes specify which descriptors and attributes can be used in the description, their allowed range of values and the rules for their combinations. The use of a common set of description schemes and descriptors would enable interoperability between different devices (i.e., devices for content providers, devices for content creators, devices for service providers and devices for users) that handle audio-visual content. These different devices would all be able to interpret summaries that use the same description schemes and descriptors. Ideally, the scheme would allow the different devices flexibility in how they present the contents of the summaries to the viewer.

[0025] A particular audiovisual program today is often created rich in media. In particular, it may have a still image, graphic, short video clip, an audio jingle, or a pictorial logo associated with it, which concisely represents its content. Such media can be used along with the usual textual title of a program. For example, a music program may have a pictorial title in addition to its textual title; a TV program may have a logo or an audio jingle. Fig. 3 illustrates an example of a description scheme for integrating data such as text, an audio clip, a video clip and a still image into information associated with a title. In the descriptive structure of Fig. 3, a title is described inside <TitleText>...</TitleText> like a title in conventional text data, while information for locating multimedia data such as an audio clip, a video clip and a still image is described inside <TitleImage>...</TitleImage>. This creates a description scheme enabling collection of the conventional text data and the information for locating multimedia data. The use of the description scheme enables the system to extract, for example, image data from a storage area for storing multimedia such as images, and to develop and add the data. Namely, the system not only presents text data but also prepares a multimedia title easily. The multimedia title descriptions can facilitate an audiovisual, informative, effective, and entertaining navigation between different audiovisual programs.

[0026] In this title description scheme, the multimedia data is represented by information for locating the multimedia data. This enables the system to prepare multimedia data information not only by directly referring to the multimedia data stored in the system but also by specifying a frame number of an AV program stored in the system, specifying the beginning time and the ending time of a video clip and an audio clip, or referring to multimedia data at the side of a provider outside the system.

[0027] Consequently, a system that does not contain the original multimedia information can prepare a multimedia title description by referring to the location of the multimedia data in the description data. This eliminates the need for storing the multimedia information at the system side, saving data storage space there. If data stored outside the system needs to be used multiple times, the system can read it once and then use it as internal data to assure rapid presentation of the data.
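
The read-once, reuse-locally behaviour described above could look like the following sketch. The names `MediaLocator` and `MediaResolver`, and the idea of passing in a `fetch` callable, are assumptions made for illustration and are not part of the description scheme.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class MediaLocator:
    """Hypothetical locator: where the media lives, plus an optional time range."""
    url: str
    start: Optional[int] = None  # beginning time of a clip, if any
    end: Optional[int] = None    # ending time of a clip, if any


class MediaResolver:
    """Resolves locators to media data, keeping a local copy after the first read."""

    def __init__(self, fetch: Callable[[MediaLocator], bytes]) -> None:
        self._fetch = fetch          # assumed: retrieves data from inside or outside the system
        self._cache: Dict[MediaLocator, bytes] = {}
        self.fetch_count = 0         # number of reads that actually reached the source

    def resolve(self, locator: MediaLocator) -> bytes:
        if locator not in self._cache:
            self._cache[locator] = self._fetch(locator)
            self.fetch_count += 1
        return self._cache[locator]
```

The description stores only the locator; the media itself is fetched on demand, and repeated use of the same external item is served from the local copy.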

[0028] The multimedia title description may be integrated into the summary description, or may be a separate description for a particular program, without impacting the functionality of multiple media titles and summaries. Once a user chooses a program of interest as a result of navigation through multimedia titles, the user may utilize the summary description for that program to quickly discover the audiovisual content of the program, browse the program, or non-linearly navigate within the program. The relationship between these descriptive structures will be discussed with reference to Figure 2.

[0029] As shown in Figure 2, the user can start by experiencing the multimedia title descriptions at step 30. The user can then select a title and hence an audiovisual program and go to the next level of description of that audiovisual program, which would be the summary description in step 32. The user then makes a selection in step 34 and browses and experiences the program.

[0030] Alternatively, the user could skip the summary description in step 32 and make a selection based only on the multimedia title description. Another possibility allows the user to skip viewing the multimedia title descriptions and instead start the selection process at the summary description level. The starting point for the user may be determined by the amount of time available, any previous knowledge of the programs, and the desired amount of detail. Similarly, it is also possible to adaptively read only description data, such as multimedia title data and summary data, into the user system first at step 30 and obtain the necessary audiovisual program data through step 32.

[0031] An example of a multimedia title description is shown below.



<Title>
  <TitleText>
    Afternoon news
  </TitleText>
  <TitleImage>
    <MediaURL>file://…</MediaURL>
  </TitleImage>
</Title>
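
A title description of this kind could be read with a standard XML parser, as in the sketch below. The element names follow the example above; the function `parse_title`, the returned dictionary layout and the placeholder media URL are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed instance; the media URL is a made-up placeholder.
TITLE_XML = """
<Title>
  <TitleText>Afternoon news</TitleText>
  <TitleImage>
    <MediaURL>file://example/thumbnail.jpg</MediaURL>
  </TitleImage>
</Title>
"""


def parse_title(xml_text):
    """Return the textual title and the list of media locations it refers to."""
    root = ET.fromstring(xml_text)
    text = (root.findtext("TitleText") or "").strip()
    media = [el.text.strip() for el in root.iter("MediaURL") if el.text]
    return {"text": text, "media": media}
```

A presentation engine could show the text immediately and fetch the referenced media only when the title is actually displayed.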



[0032] In order to understand the higher amount of detail used in the summary description, it is helpful to discuss it more thoroughly. As shown in Figure 3, audiovisual summaries are extracted from the audiovisual media 48 by the extraction module 46. Descriptions of these summaries 44 are then authored according to the summary description scheme, which specifies the elements, descriptors, attributes and other descriptions that can be included in the description, the ranges of values that they can attain, and the allowable combinations of the elements, descriptors and attributes. The summary description scheme includes a data description scheme for preparing a multimedia title and a description scheme for presenting a summary description of an audiovisual program. The multimedia title description and the summary description of an audiovisual program are prepared according to the above scheme. The displaying device 42, whether a user terminal or audiovisual device, receives the summary description and the audiovisual content. The device 42 includes a parser 50 that interprets and validates the audiovisual summary description 44 in accordance with the description scheme and presents the summaries to the user with the interface 52.

[0033] The summary description scheme of Figure 3 is shown in more detail in Figure 4. The content of Fig. 4 is shown in more detail in Fig. 5. To satisfy the various kinds of requirements from users, content providers and service providers, the present invention provides description schemes capable of describing a variety of summary descriptions as shown later. By using the description schemes, the provider side and the user side may have a common framework. Hence, the user can select a desired summary description by utilizing the summaries provided by the provider. Exemplified alternatives selectable by the user can be structured by hierarchically arranging a variety of descriptions such as "want to see a 10-second highlight scene", "want to see only a slam dunk shot" and the like. Furthermore, the provider can provide the user with a plurality of summary descriptions to meet the user preferences using the usage history information received from the user. These summary descriptions are structured to contain references to the audiovisual media and its segments, frames, and audio tracks that can be efficiently utilized by a presentation engine in rendering different summaries and views of the program. The example of Fig. 4 illustrates in detail a summary description scheme used for representing a variety of summaries as described above. The attribute 'summary type' 41 defines the type of summary description scheme 40. The use of this attribute enables the user to select either a hierarchical summary description scheme 54 or a sequential summary description scheme 56. The attributes of highlight or multiresolution are both constructed with a hierarchical description scheme, shown in more detail in Figure 5. The hierarchical summary description scheme 54 will be described later.

[0034] Figure 5 is a block diagram illustrating the hierarchical summary description schemes selectable for preparing summary descriptions. The scheme is as follows: The hierarchical summary description 54 is used to specify and group summaries of an audiovisual program, which may be structured hierarchically. It contains description data of a technique for constructing a hierarchy of an attribute 'summary type'. The hierarchical summary description 54 has plural hierarchical summary level descriptions. The hierarchical summary level descriptions are labeled and organized at different levels as shown below. Each level describes a summary of the audiovisual program by information at a specific level. The hierarchical summary level description is structured in such a way that it may have a further hierarchical summary level description to define a deeper-level summary. In general, levels closer to the root of the hierarchy provide coarse summaries and levels further away from the root provide more detailed summaries.



<HierarchicalSummary summaryType="highlight">
  <HighlightSegmentLevel name="Level1_Coarse">
    <HighlightSegmentLevel name="Level2_Middle">
    </HighlightSegmentLevel>
    <HighlightSegmentLevel name="Level2_Middle">
      <HighlightSegmentLevel name="Level3_Fine">
      </HighlightSegmentLevel>
    </HighlightSegmentLevel>
  </HighlightSegmentLevel>
</HierarchicalSummary>



[0035] The hierarchical summary description thus structured enables the detailed summary description to include the coarse summary description, eliminating the duplication of the same data for representing the summaries. To view an audiovisual program using the summary description, the user can operate the presentation engine using a desired summary level description and a higher-level summary description.

As shown in Fig. 5, the hierarchical summary level description contains references to audiovisual media and its segments, frames and audio tracks. The reference may be made to segments and frames inside and outside the system. Hence, the system by itself can obtain audiovisual media data from external data stored at external providers and/or inside the system in accordance with the selected summary and can provide the user with multiple views by using the presentation engine. When preparing a summary of a trailer for a serial film program, the description may contain the storage location of the following audiovisual program, its highlight scene and time duration, in addition to those of the preceding audiovisual program. This offers the advantage of saving data storage space at both the provider side and the user side.

The above hierarchical summary description has an attribute 'Hierarchy Type', which specifies the type of interrelation between different levels of the summary. The attribute value can specify whether the hierarchy type is dependent or independent. If the hierarchy type is "independent", the information in a hierarchical summary level can completely specify a particular summary, without reference to the information in its parent element. However, this has the drawback of increasing the amount of necessary data. If the hierarchy type is "dependent", the summary at a particular level cannot be prepared without knowledge of its parent element. In this case, the data must be organized into a hierarchical system, but the amount of data is reduced. The user who desires a summary may select either of the types in accordance with the system specification. There is a description scheme for integrating all the summaries having the above-described features.

[0036] Audiovisual summaries enable users to consume alternative views of a particular audiovisual program, where views can be chosen according to the amount of time available to the user, personal preferences and points of view, and the amount of resources available to the user's platform. These are achieved by summary descriptions using the above-described summary description schemes.

[0037] In a hierarchical summary context where summaries can be at different hierarchy levels, grouping enables summaries to be at the same hierarchy level and made available to the user as alternatives at that same level. For example, a multiview highlight summary may have a 30-second and a 60-second level for time-constrained viewing at different detail/length levels. On the other hand, a multiview event summary enables summaries based on different events and different points of view, where these summaries do not necessarily have a hierarchical relationship among them and thus they are merely alternatives.

[0038] The names of the alternative summaries can be presented to the user in an interactive menu and the user selects the desired summary by using summary description data.



ALTERNATIVE SUMMARIES

THREE POINT SHOTS

SLAM DUNK

MY FAVORITE MOMENTS



[0039] An example description according to the above-described summary description schemes that will support the items in this menu is given below. Note that the above menu items correspond to the 'Highlight Level Name' descriptors in the description. Indeed, the description may utilize a numerical, machine-readable code corresponding to the string that is presented to the user by the presentation engine. This is an implementation issue.
<Program>
  <MediaInformation>
    <MediaProfile>
      <MediaInstance><Locator>file://d…</Locator></MediaInstance>
    </MediaProfile>
  </MediaInformation>
  <MetaInformation>
    <Creation><Title>Blazers vs Pacers 2/10/99</Title></Creation>
    <Classification><Genre>Sports</Genre></Classification>
  </MetaInformation>
  <Summarization>
    <HierarchicalSummary summaryType="keyEvent" name="multiview event summary">
      <HighlightLevel name="Three Point Shots">
        <HighlightSegment name="Three Point Shot #1">
          <VideoSegmentLocator><MediaTime>10410 10680</MediaTime></VideoSegmentLocator>
          <ImageLocator><MediaTime>10600</MediaTime></ImageLocator>
        </HighlightSegment>
      </HighlightLevel>
      <HighlightLevel name="Slam Dunks">
        <HighlightSegment name="Slam Dunk #1">
          <VideoSegmentLocator><MediaTime>13350 13560</MediaTime></VideoSegmentLocator>
          <ImageLocator><MediaTime>13500</MediaTime></ImageLocator>
        </HighlightSegment>
        <!-- more video-segments -->
      </HighlightLevel>
      <HighlightLevel name="My Favorite Moments">
        <HighlightSegment name="The Best 2-Point Shot from Blazers">
          <VideoSegmentLocator><MediaTime>10110 10210</MediaTime></VideoSegmentLocator>
          <ImageLocator><MediaTime>10180</MediaTime></ImageLocator>
        </HighlightSegment>
        <HighlightSegment name="Scotty Pippin's Best Basket">
          <VideoSegmentLocator><MediaTime>… 17210</MediaTime></VideoSegmentLocator>
          <ImageLocator><MediaTime>16908</MediaTime></ImageLocator>
        </HighlightSegment>
        <!-- more video-segments -->
      </HighlightLevel>
    </HierarchicalSummary>
  </Summarization>
</Program>
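
One way a presentation engine might derive the menu of alternative summaries from such a description is sketched below. The XML is a trimmed, hypothetical instance using tag names in the style of the example above; the functions `menu_items` and `segments_for`, and the reading of `MediaTime` as a start/end pair, are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical description; times are illustrative start/end pairs.
DESCRIPTION_XML = """
<Program>
  <Summarization>
    <HierarchicalSummary summaryType="keyEvent">
      <HighlightLevel name="Three Point Shots">
        <HighlightSegment name="Three Point Shot #1">
          <VideoSegmentLocator><MediaTime>10410 10680</MediaTime></VideoSegmentLocator>
        </HighlightSegment>
      </HighlightLevel>
      <HighlightLevel name="Slam Dunks">
        <HighlightSegment name="Slam Dunk #1">
          <VideoSegmentLocator><MediaTime>13350 13560</MediaTime></VideoSegmentLocator>
        </HighlightSegment>
      </HighlightLevel>
    </HierarchicalSummary>
  </Summarization>
</Program>
"""


def menu_items(xml_text):
    """Return the highlight level names, in document order, as menu entries."""
    root = ET.fromstring(xml_text)
    return [level.get("name", "") for level in root.iter("HighlightLevel")]


def segments_for(xml_text, level_name):
    """Return (segment name, start, end) tuples for the chosen menu entry."""
    root = ET.fromstring(xml_text)
    result = []
    for level in root.iter("HighlightLevel"):
        if level.get("name") != level_name:
            continue
        for seg in level.findall("HighlightSegment"):
            times = (seg.findtext("VideoSegmentLocator/MediaTime") or "").split()
            if len(times) != 2:
                continue  # skip segments without a usable start/end pair
            result.append((seg.get("name"), int(times[0]), int(times[1])))
    return result
```

Selecting a menu entry then amounts to handing the returned segment list to the presentation engine for playback.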



[0040] Note that the menu above can facilitate bookmarking of a multitude of segments grouped under one theme, which in this case is an event-based theme. The grouping (or bookmarking) is at a 'highlight level' and is at the same hierarchical level. The last item in the menu ("My Favorite Moments") corresponds to a fragment of the description that might have been generated by the user utilizing an appropriate authoring tool in his/her system. In other words, the user may have the necessary tools to augment a summary that is available from a service provider, or tools to create a summary description from scratch.

[0041] Segment-level bookmarking is also possible as seen in the last part (shaded) of the description shown above. The user in the above example has marked two segments as "The Best 2-Point Shot from Blazers" and "Scotty Pippin's Best Basket". These two bookmarks can be presented to the user in the form of either a separate menu of bookmarks, or a submenu of the menu item "My Favorite Moments".

[0042] Alternative summaries that are not necessarily hierarchically structured are also allowed, e.g., a summary containing clips of "goals" vs. a summary containing clips of "passing shots". Such grouping of summaries is necessary to allow different event views.

[0043] A hierarchical summary refines a summary and is the root element of the hierarchical summary. A hierarchical summary may contain multiple hierarchical summary level elements as shown below. Each hierarchical summary level element specifies a (hierarchical) summary and groups a number of video segments. These summaries represent alternative views of the video program.

[0044] A hierarchical summary description has an attribute hierarchyType, which specifies the type of interrelation
between different levels of the summary. The hierarchyType can be independent or dependent. If hierarchyType is
independent, the information in a hierarchical summary level completely specifies a particular summary, without reference
to the information in its parent element. If hierarchyType is dependent, information in a hierarchical summary level adds
to, or refines, the information in its parent element; i.e., the summary at a particular level can't be reconstructed without
knowledge of the parent element.
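
By way of illustration only, the difference between the two hierarchyType values may be sketched in Python as follows. The Level class, the resolve function and the clip values are hypothetical and are not part of the description scheme; clips are represented as (start frame, end frame) pairs, and ordering the merged clips by start frame is an assumption made for playback.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Clip = Tuple[int, int]  # (start_frame, end_frame); illustrative representation


@dataclass
class Level:
    """One hierarchical summary level (hypothetical in-memory model)."""
    clips: List[Clip]
    parent: Optional["Level"] = None


def resolve(level: Level, hierarchy_type: str) -> List[Clip]:
    """Return the clips that make up the summary at `level`."""
    if hierarchy_type == "independent":
        # The level is complete on its own; the parent is not consulted.
        return sorted(level.clips)
    if hierarchy_type == "dependent":
        # The level only adds to its parent, so walk up and merge.
        inherited = resolve(level.parent, hierarchy_type) if level.parent else []
        # Sorting by start frame is an assumption, chosen for playback order.
        return sorted(inherited + level.clips)
    raise ValueError(f"unknown hierarchyType: {hierarchy_type!r}")


coarse = Level(clips=[(300, 400)])
fine = Level(clips=[(0, 100), (500, 600)], parent=coarse)

print(resolve(fine, "dependent"))    # parent clip plus the clips added here
print(resolve(fine, "independent"))  # only the clips listed at this level
```

In the dependent case the finer level cannot be played without its parent, which mirrors the statement that the summary at a particular level can't be reconstructed without knowledge of the parent element.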

[0045] The following is an example of a hierarchical summary according to the above-described summary description
schemes that contains a highlight summary. The definition of highlight level is given below. The highlight summary
may, for example, contain interesting video clips, ordered in multi-level fashion. Since the hierarchyType of this
hierarchical summary is dependent, a highlight summary at level n+1 adds more video clips to the highlight summary at level
n. Thus, each level accumulates more information to provide a longer and more extensive video summary.

<HierarchicalSummary name="mySummary" hierarchyType="dependent">
  <HighlightLevel>...</HighlightLevel>
</HierarchicalSummary>

[0046] The hierarchical summary level description scheme is used to specify a summary at a particular level of
detail. The hierarchical summary level description scheme is an abstract scheme from which two types of summary
description schemes are derived, either a highlight level description scheme, or a multiresolution description scheme.
Multiresolution description schemes are outside the scope of this disclosure and are included only for completeness. The
hierarchical summary level may contain zero or more hierarchical summary level elements as its children.

[0047] As mentioned above, the highlight level description scheme is used to specify a summary by referring to a
sequence of audio-visual segments and their key-frames. A highlight level refines a hierarchical summary level and
contains a single summary or part of a summary. A highlight level contains a sequence of references to video segments
and their representative key-frames. A locator specifies each video segment and another locator specifies each
representative key-frame. A highlight level has a required attribute name, and an attribute level, which specifies the level of
this summary in the hierarchy. It also has an attribute duration, which specifies the total duration of the summary at the
same level in the hierarchy.

[0048] The following is an example of a simple highlight summary according to the above-described summary
description schemes with a duration of 10 seconds. It consists of two video clips, the first from frame 0 to 120 and the
second from frame 200 to 380. The key-frame for the first video clip is frame 60 and the key-frame for the second clip
is frame 320. Note that a key-frame may be a frame that is visualized before the video segment itself is played; e.g.,
playback of the video segment is activated by the user clicking on the key-frame.
<HierarchicalSummary name="mySummary" summaryType="highlight">
  <HighlightLevel name="10 second highlight" duration="10">
    <HighlightSegment>
      <VideoSegmentLocator>
        <MediaTime>0 120</MediaTime>
      </VideoSegmentLocator>
      <ImageLocator>
        <MediaTime>60</MediaTime>
      </ImageLocator>
    </HighlightSegment>
    <HighlightSegment>
      <VideoSegmentLocator>
        <MediaTime>200 380</MediaTime>
      </VideoSegmentLocator>
      <ImageLocator>
        <MediaTime>320</MediaTime>
      </ImageLocator>
    </HighlightSegment>
  </HighlightLevel>
</HierarchicalSummary>
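
As a hedged illustration of how an application might read such a description, the following Python sketch parses the simple highlight summary and compares the total clip length with the declared duration. It assumes the root element is spelled HierarchicalSummary (no space) so that the XML is well-formed, and it assumes a frame rate of 30 frames per second, which the text does not state but which is consistent with 300 frames lasting 10 seconds. The function name read_segments is hypothetical.

```python
import xml.etree.ElementTree as ET

FPS = 30  # assumed frame rate; not stated in the description

XML = """
<HierarchicalSummary name="mySummary" summaryType="highlight">
  <HighlightLevel name="10 second highlight" duration="10">
    <HighlightSegment>
      <VideoSegmentLocator><MediaTime>0 120</MediaTime></VideoSegmentLocator>
      <ImageLocator><MediaTime>60</MediaTime></ImageLocator>
    </HighlightSegment>
    <HighlightSegment>
      <VideoSegmentLocator><MediaTime>200 380</MediaTime></VideoSegmentLocator>
      <ImageLocator><MediaTime>320</MediaTime></ImageLocator>
    </HighlightSegment>
  </HighlightLevel>
</HierarchicalSummary>
"""


def read_segments(level):
    """Yield (start, end, key_frame) for each direct HighlightSegment child."""
    for seg in level.findall("HighlightSegment"):
        start, end = map(int, seg.findtext("VideoSegmentLocator/MediaTime").split())
        key = int(seg.findtext("ImageLocator/MediaTime"))
        yield start, end, key


level = ET.fromstring(XML).find("HighlightLevel")
segments = list(read_segments(level))
total_frames = sum(end - start for start, end, _ in segments)

print(segments)
print(total_frames / FPS, "seconds computed; declared:", level.get("duration"))
```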



[0049] The following is an example of a set of two summaries according to the above-described summary description
schemes, one being an alternative summary to the other. The first highlight summary is 4 seconds long and
contains only a single video clip, while the second summary is 10 seconds long and contains three video clips. Note that
both summaries share one video clip; there is a hierarchical structure evident in the underlying data, yet its
representation is not hierarchical.

<HierarchicalSummary name="mySummary" summaryType="highlight">
  <HighlightLevel name="4 second highlight" duration="4">
    <HighlightSegment>
      <VideoSegmentLocator>
        <MediaTime>1000 1120</MediaTime>
      </VideoSegmentLocator>
      <ImageLocator>
        <MediaTime>1060</MediaTime>
      </ImageLocator>
    </HighlightSegment>
  </HighlightLevel>
  <HighlightLevel name="10 second highlight" duration="10">
    <HighlightSegment>
      <VideoSegmentLocator>
        <MediaTime>200 290</MediaTime>
      </VideoSegmentLocator>
      <ImageLocator>
        <MediaTime>200</MediaTime>
      </ImageLocator>
    </HighlightSegment>
    <HighlightSegment>
      <VideoSegmentLocator>
        <MediaTime>1000 1120</MediaTime>
      </VideoSegmentLocator>
      <ImageLocator>
        <MediaTime>1060</MediaTime>
      </ImageLocator>
    </HighlightSegment>
    <HighlightSegment>
      <VideoSegmentLocator>
        <MediaTime>1200 1290</MediaTime>
      </VideoSegmentLocator>
      <ImageLocator>
        <MediaTime>1200</MediaTime>
      </ImageLocator>
    </HighlightSegment>
  </HighlightLevel>
</HierarchicalSummary>



[0050] The following is an example of the same set of two summaries according to the above-described summary
description schemes, one being an alternative summary to the other. However, they are now hierarchically represented
in the description, such that the application is informed of the underlying hierarchy in the data. In this example, the
hierarchy type is "independent", which means that the common video clip must be repeated on the finer level, because the
information on the finer level must be interpreted independently.

<HierarchicalSummary name="mySummary"
    summaryType="highlight" hierarchyType="independent">
  <HighlightLevel name="4 second highlight" duration="4">
    <HighlightSegment>
      <VideoSegmentLocator>
        <MediaTime>1000 1120</MediaTime>
      </VideoSegmentLocator>
      <ImageLocator>
        <MediaTime>1060</MediaTime>
      </ImageLocator>
    </HighlightSegment>
    <HighlightLevel name="10 second highlight" duration="10">
      <HighlightSegment>
        <VideoSegmentLocator>
          <MediaTime>200 290</MediaTime>
        </VideoSegmentLocator>
        <ImageLocator>
          <MediaTime>200</MediaTime>
        </ImageLocator>
      </HighlightSegment>
      <HighlightSegment>
        <VideoSegmentLocator>
          <MediaTime>1000 1120</MediaTime>
        </VideoSegmentLocator>
        <ImageLocator>
          <MediaTime>1060</MediaTime>
        </ImageLocator>
      </HighlightSegment>
      <HighlightSegment>
        <VideoSegmentLocator>
          <MediaTime>1200 1290</MediaTime>
        </VideoSegmentLocator>
        <ImageLocator>
          <MediaTime>1200</MediaTime>
        </ImageLocator>
      </HighlightSegment>
    </HighlightLevel>
  </HighlightLevel>
</HierarchicalSummary>



[0051] The following is a more complex example of a hierarchical summary according to the above-described summary
description schemes consisting of the same video clips, organized into two levels. At the highest level, the
summary has a duration of 4 seconds and consists of only one video clip. At the second level, the summary has a duration
of 10 seconds and consists of three video clips. Note that both summaries again share one video clip as in the previous
example, but it is specified only once, by utilizing a dependent hierarchical representation (hierarchyType is
dependent).






<HierarchicalSummary name="mySummary"
    summaryType="highlight" hierarchyType="dependent">
  <HighlightLevel name="4 second highlight" duration="4">
    <HighlightLevel name="10 second highlight" duration="10">
      <HighlightSegment>
        <VideoSegmentLocator>
          <MediaTime>200 290</MediaTime>
        </VideoSegmentLocator>
        <ImageLocator>
          <MediaTime>200</MediaTime>
        </ImageLocator>
      </HighlightSegment>
    </HighlightLevel>
    <HighlightSegment>
      <VideoSegmentLocator>
        <MediaTime>1000 1120</MediaTime>
      </VideoSegmentLocator>
      <ImageLocator>
        <MediaTime>1060</MediaTime>
      </ImageLocator>
    </HighlightSegment>
    <HighlightLevel name="10 second highlight" duration="10">
      <HighlightSegment>
        <VideoSegmentLocator>
          <MediaTime>1200 1290</MediaTime>
        </VideoSegmentLocator>
        <ImageLocator>
          <MediaTime>1200</MediaTime>
        </ImageLocator>
      </HighlightSegment>
    </HighlightLevel>
  </HighlightLevel>
</HierarchicalSummary>
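
The accumulation of clips in a dependent hierarchy may be illustrated by the following Python sketch, offered as one possible reading rather than a definitive implementation. It walks the two-level dependent example in document order and collects the clips down to a chosen depth. The function name clips_up_to is hypothetical, the ImageLocator elements are omitted for brevity, and the root element is assumed to be spelled HierarchicalSummary so that the XML parses.

```python
import xml.etree.ElementTree as ET

XML = """
<HierarchicalSummary name="mySummary" summaryType="highlight" hierarchyType="dependent">
  <HighlightLevel name="4 second highlight" duration="4">
    <HighlightLevel name="10 second highlight" duration="10">
      <HighlightSegment>
        <VideoSegmentLocator><MediaTime>200 290</MediaTime></VideoSegmentLocator>
      </HighlightSegment>
    </HighlightLevel>
    <HighlightSegment>
      <VideoSegmentLocator><MediaTime>1000 1120</MediaTime></VideoSegmentLocator>
    </HighlightSegment>
    <HighlightLevel name="10 second highlight" duration="10">
      <HighlightSegment>
        <VideoSegmentLocator><MediaTime>1200 1290</MediaTime></VideoSegmentLocator>
      </HighlightSegment>
    </HighlightLevel>
  </HighlightLevel>
</HierarchicalSummary>
"""


def clips_up_to(level, max_depth, depth=1):
    """Collect clips in document order, descending at most `max_depth` levels."""
    clips = []
    for child in level:
        if child.tag == "HighlightSegment":
            start, end = child.findtext("VideoSegmentLocator/MediaTime").split()
            clips.append((int(start), int(end)))
        elif child.tag == "HighlightLevel" and depth < max_depth:
            # Dependent hierarchy: a child level adds clips to its parent.
            clips.extend(clips_up_to(child, max_depth, depth + 1))
    return clips


top = ET.fromstring(XML).find("HighlightLevel")
print(clips_up_to(top, 1))  # coarse summary: the single shared clip
print(clips_up_to(top, 2))  # fine summary: shared clip plus the added clips
```

Because the child levels are interleaved with the shared segment in the description, reading in document order yields the three clips of the finer summary in temporal order without repeating the shared clip.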



[0052] The following is an example of a different set of two summaries, one being an alternative summary to the
other, ordered in a two-level hierarchy. In this case, the video clips on the finer level are sub-clips of the single clip on
the coarse level. By utilizing the hierarchical representation, the application is informed there is some type of
hierarchical relation between the two summaries. However, the hierarchy type is "independent", since the clips on the finer level
do not literally include the clip on the coarse level (the information at the coarse level cannot be reused).
<HierarchicalSummary name="mySummary"
    summaryType="highlight" hierarchyType="independent">
  <HighlightLevel name="10 second summary" duration="10">
    <HighlightSegment>
      <VideoSegmentLocator>
        <MediaTime>1000 1300</MediaTime>
      </VideoSegmentLocator>
      <ImageLocator>
        <MediaTime>1060</MediaTime>
      </ImageLocator>
    </HighlightSegment>
    <HighlightLevel name="10 second summary" duration="10">
      <HighlightSegment>
        <VideoSegmentLocator>
          <MediaTime>1000 1090</MediaTime>
        </VideoSegmentLocator>
        <ImageLocator>
          <MediaTime>10[illegible]</MediaTime>
        </ImageLocator>
      </HighlightSegment>
      <HighlightSegment>
        <VideoSegmentLocator>
          <MediaTime>1090 1210</MediaTime>
        </VideoSegmentLocator>
        <ImageLocator>
          <MediaTime>1120</MediaTime>
        </ImageLocator>
      </HighlightSegment>
      <HighlightSegment>
        <VideoSegmentLocator>
          <MediaTime>1210 1300</MediaTime>
        </VideoSegmentLocator>
        <ImageLocator>
          <MediaTime>1270</MediaTime>
        </ImageLocator>
      </HighlightSegment>
    </HighlightLevel>
  </HighlightLevel>
</HierarchicalSummary>
[0053] The following is an example of a set of two highlights according to the above-described summary description
schemes referring to particular events in a program, in particular "slam dunks" and "three-point shots" in a basketball
game video. The first summary contains two video clips, each showing a slam-dunk; the second summary contains two
video clips, each showing a three-point shot. By grouping the clips into summaries of events, a user may choose to view
only the clips of slam-dunks; alternatively, the user may view all three-point shots. Note that in this case, there is no
notion of hierarchy in the underlying real-world events.



<HierarchicalSummary name="mySummary" summaryType="highlight">
  <HighlightLevel name="Slam dunks">
    <HighlightSegment>
      <VideoSegmentLocator>
        <MediaTime>600 680</MediaTime>
      </VideoSegmentLocator>
      <ImageLocator>
        <MediaTime>590</MediaTime>
      </ImageLocator>
    </HighlightSegment>
    <HighlightSegment>
      <VideoSegmentLocator>
        <MediaTime>1200 1380</MediaTime>
      </VideoSegmentLocator>
      <ImageLocator>
        <MediaTime>1320</MediaTime>
      </ImageLocator>
    </HighlightSegment>
  </HighlightLevel>
  <HighlightLevel name="Three-point shots">
    <HighlightSegment>
      <VideoSegmentLocator>
        <MediaTime>2500 2680</MediaTime>
      </VideoSegmentLocator>
      <ImageLocator>
        <MediaTime>2590</MediaTime>
      </ImageLocator>
    </HighlightSegment>
    <HighlightSegment>
      <VideoSegmentLocator>
        <MediaTime>3200 3380</MediaTime>
      </VideoSegmentLocator>
      <ImageLocator>
        <MediaTime>3320</MediaTime>
      </ImageLocator>
    </HighlightSegment>
  </HighlightLevel>
</HierarchicalSummary>
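
By way of a non-limiting sketch, an application could turn such an event-based description into a menu of alternative views by mapping each highlight level name to its clips. The Python below is illustrative only: the function name build_menu is hypothetical, the ImageLocator elements are omitted for brevity, and the root element is assumed to be spelled HierarchicalSummary so that the XML parses.

```python
import xml.etree.ElementTree as ET

XML = """
<HierarchicalSummary name="mySummary" summaryType="highlight">
  <HighlightLevel name="Slam dunks">
    <HighlightSegment>
      <VideoSegmentLocator><MediaTime>600 680</MediaTime></VideoSegmentLocator>
    </HighlightSegment>
    <HighlightSegment>
      <VideoSegmentLocator><MediaTime>1200 1380</MediaTime></VideoSegmentLocator>
    </HighlightSegment>
  </HighlightLevel>
  <HighlightLevel name="Three-point shots">
    <HighlightSegment>
      <VideoSegmentLocator><MediaTime>2500 2680</MediaTime></VideoSegmentLocator>
    </HighlightSegment>
    <HighlightSegment>
      <VideoSegmentLocator><MediaTime>3200 3380</MediaTime></VideoSegmentLocator>
    </HighlightSegment>
  </HighlightLevel>
</HierarchicalSummary>
"""


def build_menu(summary):
    """Map each top-level highlight level name to its (start, end) clips."""
    menu = {}
    for level in summary.findall("HighlightLevel"):
        clips = []
        for seg in level.findall("HighlightSegment"):
            start, end = seg.findtext("VideoSegmentLocator/MediaTime").split()
            clips.append((int(start), int(end)))
        menu[level.get("name")] = clips
    return menu


menu = build_menu(ET.fromstring(XML))
for title, clips in menu.items():
    print(title, clips)
```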



[0054] Having seen several examples of the hierarchical summary description scheme and its related components,
it is helpful to look at the sequential summary description scheme 56 shown in Figure 4. The sequential summary
description scheme is used to specify summaries of an audio-visual item consisting of an arbitrary but predetermined
sequence of still images or video frames, which can be visualized sequentially in time. The playback speed of video
frames can be controlled to enable smart fast-forwarding.

[0055] A sequential summary refines a summary and contains a single audio-visual summary. It contains either a
sequence of references to still images, or a sequence of video frames. A sequential summary may contain a sequence
of references to audio-clips. Audio-clips may be played back in synchronization with the video frames.

[0056] The following is an example of a simple sequential summary according to the above-described summary
description schemes, representing an animated slide-show. It refers to a number of images, which may be shown in
sequential fashion, or under control of the user.

<SequentialSummary name="mySummary" summaryType="sequential">
  <ImageLocator><MediaUrl>file://images/photo1.jpg</MediaUrl></ImageLocator>
  <ImageLocator><MediaUrl>file://images/photo2.jpg</MediaUrl></ImageLocator>
  <ImageLocator><MediaUrl>file://images/photo3.jpg</MediaUrl></ImageLocator>
  <ImageLocator><MediaUrl>file://images/photo4.jpg</MediaUrl></ImageLocator>
  <ImageLocator><MediaUrl>file://images/photo5.jpg</MediaUrl></ImageLocator>
  <ImageLocator><MediaUrl>file://images/photo6.jpg</MediaUrl></ImageLocator>
</SequentialSummary>
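
As a hedged illustration, a player might derive a display schedule for such a slide-show as sketched below in Python. The function name slide_schedule and the parameters seconds_per_image and speed are hypothetical and are not defined by the description scheme; the element name MediaUrl is assumed from the example, and the speed factor merely illustrates how controlling playback speed could support fast-forwarding.

```python
import xml.etree.ElementTree as ET

XML = """
<SequentialSummary name="mySummary" summaryType="sequential">
  <ImageLocator><MediaUrl>file://images/photo1.jpg</MediaUrl></ImageLocator>
  <ImageLocator><MediaUrl>file://images/photo2.jpg</MediaUrl></ImageLocator>
  <ImageLocator><MediaUrl>file://images/photo3.jpg</MediaUrl></ImageLocator>
  <ImageLocator><MediaUrl>file://images/photo4.jpg</MediaUrl></ImageLocator>
  <ImageLocator><MediaUrl>file://images/photo5.jpg</MediaUrl></ImageLocator>
  <ImageLocator><MediaUrl>file://images/photo6.jpg</MediaUrl></ImageLocator>
</SequentialSummary>
"""


def slide_schedule(xml_text, seconds_per_image=2.0, speed=1.0):
    """Return (display_time_in_seconds, url) pairs in sequence order."""
    root = ET.fromstring(xml_text)
    urls = [loc.findtext("MediaUrl") for loc in root.findall("ImageLocator")]
    step = seconds_per_image / speed  # a higher speed shortens each slide
    return [(i * step, url) for i, url in enumerate(urls)]


for when, url in slide_schedule(XML, speed=2.0):
    print(f"{when:4.1f}s  {url}")
```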

[0057] All of these summaries can be presented as alternatives to the user. The user selects the type of summary
desired, based upon the type of media representation desired and the level of specificity. As discussed in detail above,
the representation can be one of several different types, with multiple levels, and can be either dependent or
independent.

[0058] The presentation of these summaries as well as the transfer and communications between the various
entities involved in this presentation is shown in Figure 6. These are achieved by using common description schemes such
as the above-mentioned title descriptions and summary descriptions. The content creator/provider 62 provides the
audiovisual programs and other data services (metadata) associated with those programs to the service provider 64.
The data services may include such things as directories of key clips, or other types of indexes of the audiovisual
program, for example indexes to segments containing touchdowns and field goals in a football game. The service
provider 64 may originally prepare summary description and text information in respect to an audiovisual program
possessed by the content provider 62. The service provider 64 and the content provider 62 negotiate some type of fee
arrangement for this transfer. The content provider 62 may also be the service provider 64. The user can select an
audiovisual program based on the metadata such as title and summary descriptions provided by the service provider 64.
The metadata given to the audiovisual program has the same structure as data used for the title and summary
descriptions. Hence, an audiovisual program may have metadata provided from plural service providers. By utilizing a
variety of summaries for the same audiovisual program, the user at the system side can adaptively obtain the desired
content according to his/her preference for the summary and as the result of comparing the means, expenses and time
duration necessary for obtaining the program.

[0059] The service provider 64 then sends summary descriptions of the various audiovisual programs to the user 60,
using the above-described summary description scheme. The service provider also tracks the resources at
the user's end and the user preferences and history. Tracking such information is desirable so that the service provider can
offer the user descriptions for summaries that are desirable to the user and usable by the user's platform. The user can then
receive summary descriptions according to the user's preferences that operate on the content provided by the content
provider. There is again some fee arrangement between the service provider and the user.

[0060] The content provider may also track the user preferences and usage history to directly deliver the
summarized programs to the user 60. In this case, the summary descriptions reside at the content provider, and the content
provider selects, generates and directly delivers the appropriate program summary to the user according to the user data.

[0061] In this particular example, the user transacts separately with the content provider and the service provider
for content and summary services, respectively. However, all the functionality provided within box 70 could be provided
by either the service provider or the content provider. Some content providers may decide to offer these services, and
some service providers may decide to offer content. It is also possible that the user has arrangements with other
providers. For one service or type of content the two providers could be separate; for other services or types of
content, the two could be combined. In the case where the service and content providers are the same, the user
preferences and usage history would be sent only once.

[0062] The user may also interact with other users to exchange information by using the above-described
description scheme. For example, the user may have the capability to produce customized audiovisual program summaries
(e.g., "My Favorite Moments") at the user end. The user could then pass these customized summary descriptions to
other users 66 to share experiences or to make reviews and recommendations about a particular program. Other users
could then receive summary descriptions that operate on these programs provided by a content provider. Alternatively,
one user transfers only the description data of a customized summary for an audiovisual program to the other user, who
can then directly refer to and view the audiovisual program specified by the customized summary.
[0063] In this manner, the descriptive framework is used to provide summary descriptions. These summary 
descriptions can then be used to present alternative summaries of audiovisual content to the user. The content and the 
summary descriptions are provided according to an arrangement of transfers and transactions. 

[0064] Thus, although there has been described to this point a particular embodiment for a method and structure
for a description framework for audiovisual presentation systems, it is not intended that such specific references be
considered as limitations upon the scope of this invention except in-so-far as set forth in the following claims.

Claims 

1. A system operable to provide a description framework about programs presented by an audiovisual system, the
framework including a descriptive structure operable to identify and locate each of the audiovisual representations
of audiovisual material.

2. The system of claim 1, wherein the description framework includes at least one multimedia title description.

3. The system of claim 2, wherein the title description included in the description framework collectively describes text
data of a title of an audiovisual content and multimedia data representing a content of at least one audiovisual
program.

4. The system of claim 2, wherein the title description included in the description framework includes information for
locating multimedia data and is capable of handling multimedia data being outside the audiovisual system.

5. The system of claim 1, wherein the description framework includes at least one summary description.

6. The system of claim 5, wherein the summary description included in the description framework comprises a
combination of at least one piece of partial data obtainable by extracting a part of multimedia data composing the
audiovisual content and is described using information for locating the multimedia data and the partial data extracted.

7. The system of claim 5, wherein the summary description included in the description framework includes multimedia
data and information for locating to extract a partial multimedia data from the multimedia data and capable of
handling multimedia data being outside the audiovisual system.

8. The system of claim 1, wherein the description framework includes one of either a multimedia title description or a
summary description and at least one other component.

9. A method of presenting summaries of audiovisual content to a user, the method comprising the steps of:

a) presenting a multi-view menu of the available types of summaries to the user, wherein the multi-view can
provide hierarchical and non-hierarchical summaries;

b) receiving a user selection of a summary type; and

c) providing summaries of the selected type to the user.

10. A method of providing summary description services of audiovisual content to a user (60), comprising the steps of:

a) receiving information from a user (60), wherein said information includes specifications of platform
resources at the user end and user preferences;

b) tracking usage history of the user (60);

c) transmitting audiovisual material to the user (60); and

d) providing summary descriptions operable to be applied to the audiovisual material to the user (60).

11. The method of claim 10 wherein said receiving and providing steps are performed by a service provider (64).

12. The method of claim 11 wherein said transmitting audiovisual content step is performed by a content provider (62).

13. The method of claim 10 wherein the steps are performed by a combination content and service provider (62, 64).

14. The method of claim 10 wherein the summary descriptions provided to the user (60) are of a format allowing the
user (60) to exchange summary descriptions with other users (66).

15. The method of claim 14 wherein the format allowing the user (60) to exchange summary descriptions also allows
the user (60) to customize summary descriptions.